Characterizing mixing and measurement in quantum mechanics
What fundamental constraints characterize the relationship between a mixture $\rho = \sum_i p_i \rho_i$ of quantum states, the states $\rho_i$ being mixed, and the probabilities $p_i$? What fundamental constraints characterize the relationship between prior and posterior states in a quantum measurement? In this paper we show that there are many surprisingly strong constraints on these mixing and measurement processes that can be expressed simply in terms of the eigenvalues of the quantum states involved. These constraints capture in a succinct fashion what it means to say that a quantum measurement acquires information about the system being measured, and considerably simplify the proofs of many results about entanglement transformation.

PACS Numbers: 03.65.Bz, 03.67.-a 



I. INTRODUCTION 

Quantum mechanics harbours a rich structure whose investigation and explication is the goal of quantum information science. At present only a limited understanding of the fundamental static and dynamic properties of quantum information has been obtained, and many major problems remain open. In particular, we would like a detailed ontology and quantitative methods of description for the different types of information and dynamical processes possible within quantum mechanics. An example of the pursuit of these goals along a specific line of thought has been the partial development of a theory of entangled quantum states.

The purpose of the present paper is to pose and partially solve two fundamental problems about the static and dynamic properties of quantum information. The first of these problems is to characterize the process of mixing quantum states. More precisely, if $\rho = \sum_i p_i \rho_i$ is a mixture of quantum states $\rho_i$ with probabilities $p_i$, what constraints relate the properties of $\rho$ to the probability distribution $p_i$ and the quantum states $\rho_i$? The second problem is to characterize the relationship between the prior and posterior states in a quantum measurement. The result of our investigations is a set of two static constraints on mixtures of quantum states, two dynamic constraints on the quantum measurement process, and two partial converse results, one to the static constraints, and the other to the dynamic constraints. The statement of each of these results is rather easily understood, so we review the statements now, before proceeding to the proofs and consequences in the main body of the paper.

Suppose we mix a set of quantum states $\rho_i$ according to the probability distribution $p_i$. Then we will show that this mixing process must satisfy the constraint equations:

$$\lambda\Big(\sum_i p_i \rho_i\Big) \prec \sum_i p_i \lambda(\rho_i) \tag{1}$$

$$\bigoplus_i p_i \lambda(\rho_i) \prec \lambda\Big(\sum_i p_i \rho_i\Big). \tag{2}$$

In these equations the notation $\oplus$ denotes a direct sum of vectors, $\lambda(X)$ denotes the vector of eigenvalues of the matrix $X$ arranged so the components appear in non-increasing order, and the relation $\prec$ is the majorization relation.^1 As an example of the notation used in (2), suppose $p_1 = 1/3$, $p_2 = 2/3$, $\rho_1 = \mathrm{diag}(3/4, 1/4)$ and $\rho_2 = \mathrm{diag}(1/5, 4/5)$. Then Equation (2) becomes

$$(1/4, 1/12) \oplus (8/15, 2/15) = (1/4, 1/12, 8/15, 2/15) \prec (37/60, 23/60).$$

A formal definition of majorization appears in Subsection II B; for now the essential intuition to grasp is that the relation $x \prec y$ means that the vector $x$ is more "mixed" (or "disordered") than $y$. Thus, Equation (1) captures the intuition that $\sum_i p_i \rho_i$ is more mixed, on average, than the states $\rho_i$ appearing in the ensemble. The intuition behind (2) is a little more complex. Imagine that we prepare the state $\rho$ by randomly choosing a value for $i$ according to the probability distribution $p_i$, and then preparing the corresponding state $\rho_i$. Our quantum state, including a description of $i$, may be written as $\sum_i p_i |i\rangle\langle i| \otimes \rho_i$. We then "throw away" the state $|i\rangle$ representing our random choice of $i$, leaving only the state $\sum_i p_i \rho_i$. The relation (2) expresses the fact that when we throw away $i$, the state of the quantum system becomes less disordered.

^1 Note that the vectors on the left and right hand sides in (2) may be of different dimension; in such cases we extend whichever vector is of lesser dimension by padding it with zero entries, to enable comparison using the majorization relation.

Suppose we perform a measurement on a quantum mechanical system initially in the state $\rho$, obtaining measurement result $i$ with probability $p_i$, and corresponding posterior state $\rho'_i$. What constraints are placed on the relationship between $\rho$, $p_i$ and $\rho'_i$? We will show that the following two dynamic constraints must be satisfied:

$$\lambda(\rho) \prec \sum_i p_i \lambda(\rho'_i) \tag{5}$$

$$\bigoplus_i p_i \lambda(\rho'_i) \prec \lambda(\rho). \tag{6}$$

The intuition behind (5) is that quantum measurements acquire information about the state of the system being measured, and thus after measurement the state of the system is less mixed, on average, than before. The intuition behind (6) is a little more complex, but can be understood using Zurek's approach [13] to decoherence and quantum measurement. Recall that in this approach a measurement involves three systems: the system being measured, which starts in the state $\rho$, and ends in the state $\rho'_i$; a measuring device, which starts in some standard state, and finishes in a "pointer state" $|i\rangle$ recording the result of the measurement; and an environment which "decoheres" the measuring device, ensuring that it behaves in an essentially classical fashion. The system and measuring device interact unitarily during the measurement, ensuring that there is no change in the amount of disorder present in the system. The subsequent environmental decoherence process can also be thought of as a type of measurement, in which the different outcomes are averaged over. In this view, the environment continually measures the state of the measuring apparatus, resulting in a final state $\sum_i p_i |i\rangle\langle i| \otimes \rho'_i$ for the measuring apparatus and system being measured. This decoherence process causes an increase in the disorder present in the system, which is the intuition behind (6). More succinctly, (6) may be thought of as capturing the notion that the total ensemble of possible quantum states is more disordered after a measurement than it is before.

The importance of the static constraints (1)-(2) and the dynamic constraints (5)-(6) is further reinforced by the fact that in each case there is a type of converse to these equations. In this introduction we focus only on the more interesting case of the converse to the dynamic constraints (5) and (6); rather similar remarks hold also for the static constraints (1) and (2). Suppose $p_i$ is a probability distribution, and $\rho$ and $\rho'_i$ are quantum states such that

$$\lambda(\rho) \prec \sum_i p_i \lambda(\rho'_i). \tag{7}$$

Then we will show that there exists a quantum measurement whose measurement outcomes may be labelled by a pair of indices $(i,j)$, such that for any fixed $i$ and for all $j$ the posterior state of the quantum system after measurement is $\rho'_i$, and the probabilities $p_{ij}$ for the $(i,j)$th measurement outcome satisfy $\sum_j p_{ij} = p_i$. Unfortunately, this result is not a tight converse to equations (5) and (6), due to the introduction of the extra index $j$; however, for many purposes it is a sufficiently strong converse. We will show that even the equations (5) and (6) together do not completely characterize the quantum measurement process; however, I believe it likely that there is a simple characterization of the measurement process along similar lines that may be expressed entirely in terms of the eigenvalues of the prior and posterior states, and the probabilities of the different measurement outcomes. Of course, it is true that the quantum measurement formalism already provides such a characterization, in the form of a matrix equation; however, equations such as (5) and (6) provide far more explicit information, and as such, are likely to be more useful in practice. We will demonstrate the utility of this approach by application to the problem of entanglement transformation, simplifying the proofs of several known results about entanglement transformation.

There is a striking level of symmetry in the equations (1)-(2) and (5)-(6), which we will also see in the partial converse results. It is obviously tempting to suggest that this reflects some deeper underlying principle, much as Maxwell's equations may be derived from a deeper action principle based on the Faraday tensor, or the still deeper principles of gauge invariance and relativity. Unfortunately, I have not yet succeeded in obtaining a satisfactory form for such a deeper principle. Presumably, such a deeper principle might assist in tightening the partial converse results, or perhaps tightening the partial converses may shed light on the origin of Equations (1)-(2) and (5)-(6).

In explaining the intuitive meanings of the equations (1)-(2) and (5)-(6) we have used language such as the "disorder" present in a quantum state. One might wonder if it is possible to write down entropic statements capturing these intuitions. We will show that each of these equations in fact implies an entropic statement whose content corresponds to the intuition we have described. Of course, entropic statements should really only be interpreted in the asymptotic limit where we have a large number of identical copies of a system available; the advantage of Equations (1)-(2) and (5)-(6) is that they are stronger forms of these asymptotic statements which may be applied to single quantum systems.

This paper contains six fundamental results (together with a number of applications), expressed in the four constraint equations, (1)-(2), (5)-(6), and the partial converses to (1)-(2) and (5)-(6). We now review antecedents of these results in the existing literature. Equation (1) is an elementary consequence of classic results in the theory of majorization. Equation (2) follows as a corollary of work of Uhlmann [15], Ruskai (unpublished, 1993) and Nielsen on the relationship between mixed states and probability distributions. Equations (5) and (6) are implicit in the work of Vidal on entanglement transformation, and the partial converse to (5)-(6) is implicit in the work of Jonathan and Plenio [9] on entanglement transformation, building on earlier work by Nielsen
Q . A proof of Equation (||) in the context of entangle- 
ment transformation has also been previously obtained 
by Jonathan, Nielsen, Schumacher and Vidal (unpub- 
lished, 1999). There are several advantages to the point 
of view taken in the present paper. First, measurement is in some sense a more fundamental process than entanglement transformation, and Equations (5) and (6) highlight the fundamental connection between measurement and majorization for the first time, incidentally explaining why there is a connection between entanglement transformation and majorization: it arises as a result of a deeper connection between measurement and majorization. Second, the proofs in the present paper are novel, and have the advantage of proceeding from a more unified point of view than earlier work. As a result they are, perhaps, more elegant and informative than earlier proofs, especially the proof of the partial converse to (5)-(6), which is a substantial improvement of and extension to existing constructions. Several other items of related work are also worth pointing out. There is a substantial mathematical literature on the problem of characterizing the properties of sums $A + B$ of Hermitian matrices $A$ and $B$, and Fulton has written a nice review of recent progress on this problem, which is closely related
to the problem of mixing of density matrices. Hardy fl^ ] 
has introduced techniques in the context of entanglement 
transformation that can be used to prove d) and the par- 
tial converse to (|1)-(|1). Fuchs and Jacobs (unpublished, 
2000) have obtained a beautiful and quite different proof 
of (||), after hearing of the result from Nielsen. Finally, 
the procedure described in this paper to prove the partial
converse to (5)-(6) is a generalization of the procedures
for entanglement transformation for pure states found
by Nielsen in Q , and subsequently improved in indepen- 
dent work by Hardy, Jonathan and Nielsen (described in 
Chapter 12 of |^]), by Jensen and Schack jl^, and by 
Werner (unpublished, 2000). 

The paper is structured as follows. We begin in Section II by reviewing the two main tools that will be used in this paper, the theory of generalized measurements in quantum mechanics, and the mathematical theory of majorization. Section III contains proofs of the static constraints (1) and (2) on the mixing of quantum states, and the dynamic constraints (5) and (6) on quantum measurement, and explores some elementary consequences of these results. In Section IV we prove the partial converses to (1)-(2) and (5)-(6). Section V explains how the results of the present paper may be used to obtain simplified proofs of known results about entanglement transformation. Finally, Section VI concludes the paper with a discussion of some open problems and future directions.



II. GENERALIZED MEASUREMENTS AND 
MAJORIZATION 



Before proceeding to the main results of the paper it is useful to first review some background material on generalized measurements and the mathematical theory of majorization. All discussion in this and succeeding sections is to be understood in the context of finite-dimensional vector spaces, although infinite-dimensional analogues seem likely to hold, perhaps with some technical modifications.



A. Generalized measurements 



In this paper we use the generalized measurements formalism as our basic tool for the description of quantum measurements. The theory of generalized quantum measurements is an extension of the projective measurements described in most quantum mechanics textbooks. The generalized measurements formalism is adopted because it is better adapted to the description of many realistic quantum measurement schemes. However, it is important to appreciate that the generalized measurement formalism follows from standard quantum mechanics, in the sense that any generalized measurement can be understood as arising from the combination of unitary evolution and a projective measurement, a correspondence made explicit below. Nevertheless, the formalism of generalized measurements is in many ways more useful and mathematically elegant than the standard formulation of quantum measurement in terms of
projectors. More detailed introductions to the theory of 
generalized measurements may be found in ^9 20 l|,pl[|. 

Mathematically, a generalized measurement is specified by a set $\{E_i\}$ of measurement matrices satisfying the completeness relation $\sum_i E_i^\dagger E_i = I$. The index $i$ on the measurement matrices is in one-to-one correspondence with the possible outcomes that may occur in the measurement. The rule used to connect the measurement matrices to physics is that if the prior state of the quantum system is $\rho$ then the outcome $i$ occurs with probability $p_i = \mathrm{tr}(E_i \rho E_i^\dagger)$, and the posterior state is given by $\rho'_i = E_i \rho E_i^\dagger / \mathrm{tr}(E_i \rho E_i^\dagger)$.
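The measurement rule above is easy to evaluate numerically. The following Python fragment is a minimal sketch rather than part of the formalism itself: the helper names (`matmul`, `dagger`, `measure`) and the two-outcome qubit measurement are hypothetical choices made purely for illustration.

```python
from math import sqrt

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def measure(rho, ops):
    """Return a list of (p_i, rho_i') with p_i = tr(E_i rho E_i^dagger)."""
    results = []
    for e in ops:
        m = matmul(matmul(e, rho), dagger(e))   # unnormalized posterior state
        p = trace(m).real                       # outcome probability
        post = [[entry / p for entry in row] for row in m] if p > 0 else None
        results.append((p, post))
    return results

# Hypothetical two-outcome measurement on a qubit (illustrative values only).
# E1^dagger E1 + E2^dagger E2 = diag(0.8 + 0.2, 0.3 + 0.7) = I.
E1 = [[sqrt(0.8), 0.0], [0.0, sqrt(0.3)]]
E2 = [[sqrt(0.2), 0.0], [0.0, sqrt(0.7)]]
rho = [[0.5, 0.3], [0.3, 0.5]]

outcomes = measure(rho, [E1, E2])
for p, post in outcomes:
    print(round(p, 6), [[round(z.real, 6) for z in row] for row in post])
```

The probabilities sum to one because of the completeness relation, and each posterior state has unit trace by construction.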

Generalized measurements are obviously more general than the projective measurements described in most textbooks. Projective measurements have the feature that they are repeatable, in the sense that if one performs a projective measurement twice in a row on a quantum system, then one will obtain the same result both times. By contrast, most real measurements don't have this feature of being repeatable, which tips us off to the need for the formalism of generalized measurements. Nevertheless, even the generalized measurement formalism can be understood in terms of projective measurements as follows: the effect of a generalized measurement on a quantum system is equivalent to a unitary interaction between the system being measured and another "ancilla" system, followed by a projective measurement on the ancilla system. More precisely, suppose $\{E_i\}$ is a set of measurement matrices satisfying the completeness relation $\sum_i E_i^\dagger E_i = I$. We introduce an ancilla system with orthonormal basis elements $|i\rangle$ indexed by the possible measurement outcomes. Define a matrix $U$ acting on the joint quantum system-ancilla by the action:

$$U|\psi\rangle|0\rangle \equiv \sum_i E_i|\psi\rangle|i\rangle, \tag{8}$$

where $|0\rangle$ is some standard state of the ancilla and $|\psi\rangle$ is an arbitrary state of the quantum system being measured. It is easy to show using the completeness relation $\sum_i E_i^\dagger E_i = I$ that $U$ can be extended to a unitary matrix acting on the entire state space of the joint system. Suppose we perform the unitary transformation $U$ on the joint quantum system-ancilla, and then do a projective measurement of the ancilla in the $|i\rangle$ basis. It is then easily checked that the result of the measurement is $i$ with probability $p_i = \mathrm{tr}(E_i \rho E_i^\dagger)$ and the corresponding post-measurement state of the system is $\rho'_i = E_i \rho E_i^\dagger / \mathrm{tr}(E_i \rho E_i^\dagger)$. Thus, the effect on the quantum system is exactly as we have described above for a generalized quantum measurement. Conversely, it is not difficult to verify that the effect of a unitary interaction between system and ancilla followed by a projective measurement on the ancilla can always be understood
in terms of a generalized measurement (see for example 
Chapter 8 of §). 

B. Majorization 

Our primary tool in the study of mixing and measure- 
ment in quantum mechanics is the theory of majoriza- 
tion, whose basic elements we now review. The following 
review only covers elementary aspects of the theory of 



majorization, and the reader is referred to Chapters 2 
and 3 of |2|] , or |2j] for more extensive background. 

The basic motivation for majorization is to capture what it means to say that one probability distribution is "more mixed" than another. Suppose $x = (x_1, \ldots, x_d)$ and $y = (y_1, \ldots, y_d)$ are two $d$-dimensional real vectors; we usually suppose in addition that $x$ and $y$ are probability distributions, that is, the components are non-negative and sum to one, but the following definitions apply in the case of general $x$ and $y$ as well. The relation $x \prec y$, read "$x$ is majorized by $y$", is intended to capture the notion that $x$ is more mixed (i.e. disordered) than $y$. To make the formal definition, we introduce the notation $\downarrow$ to denote the components of a vector rearranged into non-increasing order, so $x^\downarrow = (x_1^\downarrow, \ldots, x_d^\downarrow)$, where $x_1^\downarrow \geq x_2^\downarrow \geq \cdots \geq x_d^\downarrow$. We say that $x$ is majorized by $y$ and write $x \prec y$, if

$$\sum_{j=1}^{k} x_j^\downarrow \leq \sum_{j=1}^{k} y_j^\downarrow \tag{9}$$

for $k = 1, \ldots, d$, and with the inequality holding with equality when $k = d$.

It is perhaps not so clear how this definition connects with any natural notion of comparative disorder. We will state but not prove a remarkable result connecting majorization to a natural notion of mixing. It can be shown (see the references on majorization cited above) that $x \prec y$ if and only if $x = \sum_i p_i P_i y$, where the $p_i$ form a probability distribution and the $P_i$ are permutation matrices. Thus, when $x \prec y$ we can imagine that $y$ is the input probability distribution to a noisy channel which randomly permutes the symbols sent through the channel, inducing an output probability distribution $x$. From this characterization many other important results follow with minimal effort; for example, it can easily be shown that if $x \prec y$ then the Shannon entropy of the distribution $x$ must be at least as great as that of $y$.
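The "noisy permutation channel" picture can be checked on a small example. The sketch below is illustrative only: the input distribution, the permutations and their weights are hypothetical values, and the helper names are our own. It forms a random mixture of permutations of $y$ and confirms that the result has smaller partial sums and larger Shannon entropy.

```python
from fractions import Fraction as F
from math import log2

def permute(perm, v):
    """Apply a permutation, given as a tuple of source indices, to a vector."""
    return [v[j] for j in perm]

def mix(weights, perms, y):
    """Return x = sum_i weights[i] * (perms[i] applied to y)."""
    x = [F(0)] * len(y)
    for w, perm in zip(weights, perms):
        for k, val in enumerate(permute(perm, y)):
            x[k] += w * val
    return x

def partial_sums(v):
    """Partial sums of the entries of v sorted into non-increasing order."""
    out, total = [], 0
    for a in sorted(v, reverse=True):
        total += a
        out.append(total)
    return out

def shannon(v):
    """Shannon entropy in bits."""
    return -sum(float(a) * log2(float(a)) for a in v if a > 0)

# Hypothetical input distribution and permutation mixture.
y = [F(1, 2), F(1, 3), F(1, 6)]
weights = [F(1, 2), F(1, 4), F(1, 4)]
perms = [(0, 1, 2), (1, 0, 2), (2, 1, 0)]

x = mix(weights, perms, y)
print(x)                                  # output distribution of the channel
print(partial_sums(x), partial_sums(y))   # each entry for x is at most that for y
print(shannon(x) >= shannon(y))
```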

The connection between majorization and quantum 
mechanics arises primarily as a result of Horn's lemma 
(proved in [^; for a simple proof see jl^), which states 
that $x \prec y$ if and only if there exists a unitary matrix
$u = (u_{ij})$ such that $x_i = \sum_j |u_{ij}|^2 y_j$. This fundamental
relationship between majorization and unitarity ensures 
many close connections between majorization and quan- 
tum mechanics. 
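One direction of Horn's lemma can be illustrated numerically: applying the squared moduli of a unitary matrix to $y$ yields a vector that is no less mixed than $y$. The following sketch assumes a real rotation matrix as the unitary; the angle and the vector $y$ are hypothetical values chosen for illustration.

```python
from math import cos, sin, pi

def horn_image(u, y):
    """Return x with x_i = sum_j |u_ij|^2 y_j, for a unitary u given as a list of rows."""
    return [sum(abs(u_ij) ** 2 * y_j for u_ij, y_j in zip(row, y)) for row in u]

def rotation(theta):
    """A 2x2 real rotation matrix, which is in particular unitary."""
    return [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]

y = [0.9, 0.1]
x = horn_image(rotation(pi / 6), y)
print(x)   # approximately [0.7, 0.3]
```

For two-dimensional probability vectors, $x \prec y$ reduces to the single condition that the largest entry of $x$ does not exceed the largest entry of $y$, which is what the example exhibits.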

As an elementary consequence of Horn's lemma we have Ky Fan's maximum principle, which states that for any Hermitian matrix $A$, the sum of the $k$ largest eigenvalues of $A$ is the maximum value of $\mathrm{tr}(AP)$, where the maximum is taken over all $k$-dimensional projectors $P$,

$$\sum_{j=1}^{k} \lambda_j(A) = \max_P \mathrm{tr}(AP). \tag{10}$$

To see this, note that choosing $P$ to be the projector onto the space spanned by the $k$ eigenvectors of $A$ with the $k$ largest eigenvalues results in $\mathrm{tr}(AP) = \sum_{j=1}^{k} \lambda_j(A)$. The proof of Ky Fan's maximum principle will be completed if we can show that $\mathrm{tr}(AP) \leq \sum_{j=1}^{k} \lambda_j(A)$ for any $k$-dimensional projector $P$. To see this, let $|e_1\rangle, \ldots, |e_d\rangle$ be an orthonormal basis chosen such that $P = \sum_{j=1}^{k} |e_j\rangle\langle e_j|$. Let $|f_1\rangle, \ldots, |f_d\rangle$ be an orthonormal set of eigenvectors for $A$, ordered so the corresponding eigenvalues are in non-increasing order. Then

$$\langle e_j|A|e_j\rangle = \sum_{l=1}^{d} |u_{jl}|^2 \lambda_l(A), \tag{11}$$

where $u_{jl} \equiv \langle e_j|f_l\rangle$ is unitary. By Horn's lemma it follows that $(\langle e_j|A|e_j\rangle) \prec \lambda(A)$, which implies that

$$\mathrm{tr}(AP) = \sum_{j=1}^{k} \langle e_j|A|e_j\rangle \leq \sum_{j=1}^{k} \lambda_j(A), \tag{12}$$

as required.

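Ky Fan's maximum principle gives rise to a useful constraint on the eigenvalues of a sum of two Hermitian matrices, that $\lambda(A + B) \prec \lambda(A) + \lambda(B)$. To see this, choose a $k$-dimensional projector $P$ such that

$$\sum_{j=1}^{k} \lambda_j(A + B) = \mathrm{tr}((A + B)P) \tag{13}$$

$$= \mathrm{tr}(AP) + \mathrm{tr}(BP) \tag{14}$$

$$\leq \sum_{j=1}^{k} \lambda_j(A) + \sum_{j=1}^{k} \lambda_j(B), \tag{15}$$

where the last line also follows from Ky Fan's maximum principle. Equality holds when $k = d$, since $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$.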
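A quick numerical check of the eigenvalue constraint on a sum of two Hermitian matrices can be made with $2 \times 2$ real symmetric matrices, whose eigenvalues have a closed form. This is a sketch under that restriction only; the matrices `A` and `B` are hypothetical examples and `eig_sym2` is our own helper.

```python
from math import sqrt

def eig_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, in non-increasing order."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b ** 2)
    return [mean + radius, mean - radius]

def add(m, n):
    """Entrywise sum of two 2x2 matrices."""
    return [[m[r][c] + n[r][c] for c in range(2)] for r in range(2)]

# Hypothetical non-commuting rank-one projectors.
A = [[1.0, 0.0], [0.0, 0.0]]
B = [[0.5, 0.5], [0.5, 0.5]]

lam_sum = eig_sym2(add(A, B))                             # eigenvalues of A + B
bound = [p + q for p, q in zip(eig_sym2(A), eig_sym2(B))]  # lambda(A) + lambda(B)
print(lam_sum, bound)
```

Here the largest eigenvalue of $A + B$ is $1 + 1/\sqrt{2}$, which is strictly below the bound of 2, while the traces agree.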

Another consequence of Horn's lemma is that given a density matrix $\rho$ and a probability distribution $p_i$ there exist pure states $|\psi_i\rangle$ such that $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$ if and only if $(p_i) \prec \lambda(\rho)$ (see [16]; this result was also obtained in unpublished work by Ruskai (1993)), where it is understood that if the vector $(p_i)$ contains more terms than the vector $\lambda(\rho)$ then the vector $\lambda(\rho)$ is to be "padded" with extra zero terms. The proof of this result is simply to combine Horn's lemma with the classification of ensembles $\{p_i, |\psi_i\rangle\}$ consistent with a given density matrix $\rho$, as discovered independently by Schrödinger [26], Jaynes, and Hughston, Jozsa and Wootters. See the references just cited for the details of the proof.

This notion of "padding" vectors of unequal dimension so they can be compared by the majorization relation is surprisingly useful, and we adopt the general convention that when $x$ and $y$ are of different dimension then $x \prec y$ means that the relation holds after $x$ and $y$ are padded with extra zero components to ensure that they have the same dimension. For example, $(1/3, 1/3, 1/3) \prec (1/2, 1/2)$ since $(1/3, 1/3, 1/3) \prec (1/2, 1/2, 0)$. It is easy to check that this extended notion of majorization is well-defined, provided $x$ and $y$ both have non-negative components, and this will be the case for all the applications in this paper. Similarly, it is often useful to write $x = y$ provided the padded versions of $x$ and $y$ are equal, that is, the non-zero entries of $x$ and $y$ are equal. With these conventions, it is easy to see that algebraic manipulations proceed exactly as one would expect. For example, for non-negative real vectors $w, x, y, z$, if $w \prec x$, $x = y$, $y \prec z$ then obviously $w \prec z$, even if all four vectors have different dimensionality. We occasionally make use of such elementary observations in proofs, without explicit comment.
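The padded comparison is mechanical enough to sketch in code. The following Python fragment is an illustration of the definition and the padding convention, not an algorithm taken from the text; the function name `is_majorized` is our own. It pads the shorter vector with zeros, sorts both vectors into non-increasing order, and compares partial sums using exact rational arithmetic.

```python
from fractions import Fraction as F

def is_majorized(x, y):
    """Return True if x is majorized by y, padding the shorter vector with zeros."""
    d = max(len(x), len(y))
    xs = sorted(list(x) + [0] * (d - len(x)), reverse=True)
    ys = sorted(list(y) + [0] * (d - len(y)), reverse=True)
    if sum(xs) != sum(ys):          # the k = d condition: totals must agree
        return False
    sx = sy = 0
    for a, b in zip(xs, ys):        # partial sums for k = 1, ..., d
        sx += a
        sy += b
        if sx > sy:
            return False
    return True

uniform3 = [F(1, 3)] * 3
half2 = [F(1, 2)] * 2
print(is_majorized(uniform3, half2))   # the padding example from the text
print(is_majorized(half2, uniform3))   # the reverse relation fails
```

Exact fractions are used so that the equality condition at $k = d$ is not disturbed by rounding; with floating-point eigenvalues a small tolerance would be needed instead.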

The final result about majorization we shall need is that if $P_i$ are a set of orthogonal projectors such that $\sum_i P_i = I$, and $\rho$ is a density matrix, then

$$\lambda\Big(\sum_i P_i \rho P_i\Big) \prec \lambda(\rho). \tag{16}$$

Intuitively, if a projective measurement of a quantum system is performed, but we do not learn the result of the measurement, then the state of the system after measurement is more mixed than it was before. One way of proving this relation is via Horn's lemma; a sketch follows. First, note that it suffices to prove that $\lambda(P \rho P + Q \rho Q) \prec \lambda(\rho)$, where $P$ and $Q = I - P$ are two orthogonal projectors satisfying $P + Q = I$. Once this is proved, the general relation (16) follows by a simple induction. However, if we define a unitary matrix $U \equiv P - Q$ then it is easy to verify that

$$P \rho P + Q \rho Q = \frac{\rho + U \rho U^\dagger}{2}. \tag{17}$$

Applying Horn's lemma and the easily proved fact that if $x_1 \prec y$ and $x_2 \prec y$ then $(x_1 + x_2)/2 \prec y$, it follows with a little simple linear algebra that $\lambda(P \rho P + Q \rho Q) \prec \lambda(\rho)$.
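As a concrete instance of the identity relating the two-projector case to conjugation by $U = P - Q$, consider a single qubit with $P = |0\rangle\langle 0|$ and $Q = |1\rangle\langle 1|$. The symbols $a$, $b$ and $x$ below are illustrative matrix entries introduced only for this worked example.

```latex
\rho = \begin{pmatrix} a & x \\ \bar{x} & b \end{pmatrix}, \qquad
U = P - Q = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
U \rho U^\dagger = \begin{pmatrix} a & -x \\ -\bar{x} & b \end{pmatrix},
\\[1ex]
\frac{\rho + U \rho U^\dagger}{2}
  = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
  = P \rho P + Q \rho Q ,
\\[1ex]
\lambda(\rho) = \left( \frac{a+b}{2} + \sqrt{\Big(\frac{a-b}{2}\Big)^2 + |x|^2},\;
                       \frac{a+b}{2} - \sqrt{\Big(\frac{a-b}{2}\Big)^2 + |x|^2} \right).
```

The off-diagonal terms cancel in the average, and since $\sqrt{((a-b)/2)^2 + |x|^2} \geq |a-b|/2$ the diagonal entries $(a, b)$ are majorized by $\lambda(\rho)$, as the general relation requires.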



III. PROOF OF CONSTRAINTS ON MIXING 
AND MEASUREMENT IN QUANTUM 
MECHANICS 



In this section we prove the four constraints, (1)-(2), (5)-(6). The first and second of these are static constraints on the mixing of quantum states, proved in Subsection III A. The third and fourth constraint equations are dynamic constraints on the quantum measurement process, proved in Subsection III B. Finally, some simple consequences of these results are discussed in Subsection III C.



A. Static constraints on mixing quantum states

Theorem 1: Suppose $\rho = \sum_i p_i \rho_i$ is a convex combination of quantum states $\rho_i$ with probabilities $p_i$. Then

$$\lambda(\rho) \prec \sum_i p_i \lambda(\rho_i) \tag{18}$$

$$\bigoplus_i p_i \lambda(\rho_i) \prec \lambda(\rho). \tag{19}$$

Proof of (18): This is an immediate consequence of the fact that $\lambda(A + B) \prec \lambda(A) + \lambda(B)$ for any two Hermitian matrices $A$ and $B$, as proved in Subsection II B.

Proof of (19): As noted in Subsection II B, if a density matrix $\rho$ can be written as a convex combination of pure states $|\psi_i\rangle$, $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$, then it follows that $(p_i) \prec \lambda(\rho)$, where $(p_i)$ denotes the vector whose entries are the probabilities $p_i$. Equation (19) is a corollary of this result. To see this, note that if $r_{ij}$ are the eigenvalues of $\rho_i$ and $|i,j\rangle$ the corresponding orthonormal eigenvectors then (19) is equivalent to the equation

$$(p_i r_{ij}) \prec \lambda(\rho), \tag{20}$$

which follows from the results of Subsection II B and the observation that

$$\rho = \sum_i p_i \rho_i = \sum_{ij} p_i r_{ij} |i,j\rangle\langle i,j|. \tag{21}$$

This completes the proof of Theorem 1.
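The two static constraints of Theorem 1 can be checked numerically on a small example with non-commuting states. The sketch below is illustrative only: it is restricted to $2 \times 2$ real symmetric density matrices so that eigenvalues have a closed form, the two states and the weights are hypothetical values, and the helper names are our own.

```python
from math import sqrt

TOL = 1e-9  # tolerance for floating-point comparisons

def eig_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, in non-increasing order."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b ** 2)
    return [mean + radius, mean - radius]

def is_majorized(x, y):
    """True if x is majorized by y (zero padding, tolerance TOL)."""
    d = max(len(x), len(y))
    xs = sorted(list(x) + [0.0] * (d - len(x)), reverse=True)
    ys = sorted(list(y) + [0.0] * (d - len(y)), reverse=True)
    if abs(sum(xs) - sum(ys)) > TOL:
        return False
    sx = sy = 0.0
    for a, b in zip(xs, ys):
        sx += a
        sy += b
        if sx > sy + TOL:
            return False
    return True

# Hypothetical ensemble: two non-commuting qubit states, equal weights.
p = [0.5, 0.5]
states = [[[0.75, 0.0], [0.0, 0.25]], [[0.5, 0.3], [0.3, 0.5]]]

rho = [[sum(w * s[r][c] for w, s in zip(p, states)) for c in range(2)]
       for r in range(2)]
lam_rho = eig_sym2(rho)
weighted_sum = [sum(w * eig_sym2(s)[k] for w, s in zip(p, states)) for k in range(2)]
direct_sum = [w * lam for w, s in zip(p, states) for lam in eig_sym2(s)]

holds_18 = is_majorized(lam_rho, weighted_sum)   # first static constraint
holds_19 = is_majorized(direct_sum, lam_rho)     # second static constraint
print(holds_18, holds_19)
```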



B. Dynamical constraints on quantum measurement 

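Theorem 2: Suppose $\{E_i\}$ is a set of measurement matrices satisfying the completeness relation $\sum_i E_i^\dagger E_i = I$. Then the quantum measurement described by these matrices must satisfy the following four constraints:

$$\lambda\Big(\sum_i E_i \rho E_i^\dagger\Big) \prec \sum_i \lambda(E_i \rho E_i^\dagger) \tag{22}$$

$$\bigoplus_i \lambda(E_i \rho E_i^\dagger) \prec \lambda\Big(\sum_i E_i \rho E_i^\dagger\Big) \tag{23}$$

$$\lambda(\rho) \prec \sum_i \lambda(E_i \rho E_i^\dagger) \tag{24}$$

$$\bigoplus_i \lambda(E_i \rho E_i^\dagger) \prec \lambda(\rho). \tag{25}$$

A slightly different way of stating Theorem 2 is to define $p_i$ to be the probability of obtaining outcome $i$ when the measurement defined by the matrices $\{E_i\}$ is performed on the system, and let $\rho'_i = E_i \rho E_i^\dagger / \mathrm{tr}(E_i \rho E_i^\dagger)$ be the corresponding posterior states. Then the following four equations are equivalent to (22)-(25):

$$\lambda\Big(\sum_i p_i \rho'_i\Big) \prec \sum_i p_i \lambda(\rho'_i) \tag{26}$$

$$\bigoplus_i p_i \lambda(\rho'_i) \prec \lambda\Big(\sum_i p_i \rho'_i\Big) \tag{27}$$

$$\lambda(\rho) \prec \sum_i p_i \lambda(\rho'_i) \tag{28}$$

$$\bigoplus_i p_i \lambda(\rho'_i) \prec \lambda(\rho). \tag{29}$$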

Theorem 2 is a fundamental constraint on the dynamics that may occur during a quantum measurement. Equations (26) and (27) are, of course, merely the dynamical expression of the static constraints found earlier in Theorem 1. Equations (28) and (29) represent novel constraints of an essentially dynamical nature, connecting as they do the prior and posterior states of the quantum measurement. Intuitively, Equation (28) captures the notion that a quantum measurement "gains information" (on average) about a quantum state, since it says that the eigenvalues of the initial state $\rho$ are, on average, more disordered than the eigenvalues of the posterior states $\rho'_i$. Intuitively, the second dynamic constraint, (29), captures the notion that the total ensemble of possible quantum states is more disordered after the measurement than before. Thus, (28) and (29) represent complementary constraints on the evolution of a quantum system during a quantum measurement process.
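The two dynamic constraints can be verified on a simple qubit measurement. The following sketch is illustrative only and works with the unnormalized posterior matrices $E_i \rho E_i^\dagger$, whose eigenvalues are $p_i \lambda(\rho'_i)$. It is restricted to real $2 \times 2$ matrices, so the adjoint is a transpose; the measurement matrices and the state are hypothetical values, and the helper names are our own.

```python
from math import sqrt

TOL = 1e-9  # tolerance for floating-point comparisons

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def eig_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, in non-increasing order."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    mean = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b ** 2)
    return [mean + radius, mean - radius]

def is_majorized(x, y):
    """True if x is majorized by y (zero padding, tolerance TOL)."""
    d = max(len(x), len(y))
    xs = sorted(list(x) + [0.0] * (d - len(x)), reverse=True)
    ys = sorted(list(y) + [0.0] * (d - len(y)), reverse=True)
    if abs(sum(xs) - sum(ys)) > TOL:
        return False
    sx = sy = 0.0
    for a, b in zip(xs, ys):
        sx += a
        sy += b
        if sx > sy + TOL:
            return False
    return True

# Hypothetical measurement matrices; E1^T E1 + E2^T E2 = I.
E = [[[sqrt(0.8), 0.0], [0.0, sqrt(0.3)]],
     [[sqrt(0.2), 0.0], [0.0, sqrt(0.7)]]]
rho = [[0.5, 0.3], [0.3, 0.5]]          # prior state with eigenvalues (0.8, 0.2)

M = [matmul(matmul(e, rho), transpose(e)) for e in E]   # E_i rho E_i^dagger
lam_rho = eig_sym2(rho)
sum_lam = [sum(eig_sym2(m)[k] for m in M) for k in range(2)]
direct_sum = [lam for m in M for lam in eig_sym2(m)]

gains_information = is_majorized(lam_rho, sum_lam)       # prior more mixed than average posterior
ensemble_more_mixed = is_majorized(direct_sum, lam_rho)  # posterior ensemble more mixed than prior
print(gains_information, ensemble_more_mixed)
```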

The constraints (26)-(29) are applicable even for very complex measurement processes. For example, a single
mode cavity undergoing direct photodetection by an ideal 
photodetector can be described by a special case of the 
generalized measurements formalism known as the quan- 
tum trajectories or stochastic Schrodinger equation pic- 
ture (see [^,^ for a review and references). In this 
picture, if the system is started in the state $\rho$ then the final state of the system is $\rho_h$, where "$h$" is used here to denote not just a single measurement outcome, but rather the complete history recorded by the photodetector, that is, all the times at which photocounts occurred. Then (28) and (29) may be written as

$$\lambda(\rho) \prec \int d\mu(h)\, \lambda(\rho_h) \tag{30}$$

$$\bigoplus d\mu(h)\, \lambda(\rho_h) \prec \lambda(\rho), \tag{31}$$

where the integral is a functional integral over all possible photodetection histories, and $d\mu(h)$ is the corresponding measure on histories.

Proof of Theorem 2: The first two equations of Theorem 2, (22) and (23), are immediate consequences of the deeper static constraints on quantum mechanics introduced in Theorem 1; here we are merely enumerating the implications these static constraints have for dynamics. The remaining constraints, (24) and (25), are genuine quantum dynamical constraints relating the prior and posterior states of a quantum measurement.

Proof of (24): Suppose $\rho$ is a positive matrix which can be written in the block form:

$$\rho = \begin{pmatrix} A & X \\ X^\dagger & B \end{pmatrix}. \tag{32}$$

For our purposes $\rho$ will most often be a density matrix (and thus satisfy $\mathrm{tr}(\rho) = 1$), but the results we prove hold for a general positive matrix. We will show that $\lambda(\rho) \prec \lambda(A) + \lambda(B)$. (Recall our conventions on padding, which imply that the vectors of eigenvalues for $A$ and $B$ are to be extended by zeroes in such a way that they contain as many entries as the vector of eigenvalues of $\rho$.) $\rho$ is a positive matrix, so there must exist a matrix $D = [D_1 | D_2]$ such that $\rho = D^\dagger D$, where the matrices $D_1$ and $D_2$ have the same number of columns as $A$ and $B$, respectively, and both have the same number of rows as $\rho$. Thus we have

$$\begin{pmatrix} A & X \\ X^\dagger & B \end{pmatrix} = D^\dagger D = \begin{pmatrix} D_1^\dagger D_1 & D_1^\dagger D_2 \\ D_2^\dagger D_1 & D_2^\dagger D_2 \end{pmatrix}, \tag{33}$$

from which we read off $A = D_1^\dagger D_1$ and $B = D_2^\dagger D_2$. Using the results of Subsection II B and the fact that the eigenvalues of a product $EF$ of matrices $E$ and $F$ are the same as the eigenvalues of $FE$, up to padding by zeroes, we see that

$$\lambda(\rho) = \lambda(D^\dagger D) \tag{34}$$

$$= \lambda(D D^\dagger) \tag{35}$$

$$= \lambda(D_1 D_1^\dagger + D_2 D_2^\dagger) \tag{36}$$

$$\prec \lambda(D_1 D_1^\dagger) + \lambda(D_2 D_2^\dagger) \tag{37}$$

$$= \lambda(D_1^\dagger D_1) + \lambda(D_2^\dagger D_2) \tag{38}$$

$$= \lambda(A) + \lambda(B), \tag{39}$$

and thus $\lambda(\rho) \prec \lambda(A) + \lambda(B)$, as claimed. This method for eliminating off-diagonal block terms was introduced
by Wielandt to connect the Weyl and Aronszajn inequal- 
ities (cited as ^] in Chapter 3 of p^.) 

As a straightforward consequence we see by induction 
that for any positive matrix p and complete set of or- 
thogonal projectors {Pi}- 



A(p) ^^A(P,pP,) 



(40) 



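Extending even further, suppose $\{E_i\}$ is any set of
measurement matrices defining a generalized measurement,
and $\rho$ is a positive matrix. As in Subsection II A we can
introduce an ancilla system with an orthonormal basis
$|i\rangle$ in one-to-one correspondence with the indices on the
measurement matrices $E_i$ and define a unitary matrix $U$
which has the action

\[
U|\psi\rangle|0\rangle = \sum_i E_i|\psi\rangle|i\rangle,
\tag{41}
\]

where $|0\rangle$ is some standard state of the ancilla. Then we
have $\lambda(\rho) = \lambda(\rho \otimes |0\rangle\langle 0|)$, since the non-zero eigenvalues
of $\rho$ and $\rho \otimes |0\rangle\langle 0|$ are the same. Simple algebra and (40)
imply that

\begin{align}
\lambda(\rho) &= \lambda\big(U(\rho \otimes |0\rangle\langle 0|)U^\dagger\big) \tag{42} \\
&\prec \sum_i \lambda\big((I \otimes |i\rangle\langle i|)\,U(\rho \otimes |0\rangle\langle 0|)U^\dagger\,(I \otimes |i\rangle\langle i|)\big) \tag{43} \\
&= \sum_i \lambda\big(E_i \rho E_i^\dagger \otimes |i\rangle\langle i|\big) \tag{44} \\
&= \sum_i \lambda\big(E_i \rho E_i^\dagger\big), \tag{45}
\end{align}

where in the last line we used
$\lambda(E_i \rho E_i^\dagger \otimes |i\rangle\langle i|) = \lambda(E_i \rho E_i^\dagger)$,
since the non-zero entries agree. This completes the proof of (24).

Proof of (25): Again, let $U$ be the unitary matrix
constructed in Subsection II A to implement the measurement
described by the measurement matrices $\{E_i\}$,
namely, any unitary matrix having the action

\[
U|\psi\rangle|0\rangle = \sum_i E_i|\psi\rangle|i\rangle.
\tag{46}
\]

Again, we have $\lambda(\rho) = \lambda(\rho \otimes |0\rangle\langle 0|)$, since the non-zero
eigenvalues of $\rho$ are the same as those of $\rho \otimes |0\rangle\langle 0|$, and
thus $\lambda(\rho) = \lambda\big(U(\rho \otimes |0\rangle\langle 0|)U^\dagger\big)$. It follows from
Equation (??) that

\[
\lambda\Big(\sum_i (I \otimes |i\rangle\langle i|)\,U(\rho \otimes |0\rangle\langle 0|)U^\dagger\,(I \otimes |i\rangle\langle i|)\Big) \prec \lambda(\rho),
\tag{47}
\]

and thus

\[
\lambda\Big(\sum_i E_i \rho E_i^\dagger \otimes |i\rangle\langle i|\Big) \prec \lambda(\rho).
\tag{48}
\]

This last equation is obviously equivalent to the statement
we set out to prove,

\[
\oplus_i\, \lambda(E_i \rho E_i^\dagger) \prec \lambda(\rho),
\tag{49}
\]

which concludes the proof of Theorem 2. ■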
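The two measurement relations established in the proof of Theorem 2, $\lambda(\rho) \prec \sum_i \lambda(E_i \rho E_i^\dagger)$ and $\oplus_i \lambda(E_i \rho E_i^\dagger) \prec \lambda(\rho)$, can be checked on a small case. The sketch below assumes a diagonal qubit state and two diagonal measurement matrices, chosen here purely for illustration so that every spectrum can be read off the diagonal; the helper `majorized_by` is likewise an assumption of this sketch.

```python
import math

def majorized_by(x, y, tol=1e-9):
    """Return True if x ≺ y, after zero-padding and sorting both vectors."""
    n = max(len(x), len(y))
    xs = sorted(x, reverse=True) + [0.0] * (n - len(x))
    ys = sorted(y, reverse=True) + [0.0] * (n - len(y))
    sx = sy = 0.0
    for u, v in zip(xs, ys):
        sx, sy = sx + u, sy + v
        if sx > sy + tol:
            return False
    return abs(sx - sy) <= tol

rho = [0.75, 0.25]                       # diagonal of the prior state
E = [[math.sqrt(0.8), math.sqrt(0.3)],   # diagonal of E_1
     [math.sqrt(0.2), math.sqrt(0.7)]]   # diagonal of E_2

# Completeness: the diagonal of sum_i E_i^† E_i should be (1, 1).
completeness = [sum(e[k] ** 2 for e in E) for k in range(2)]

# Spectra of the unnormalized posteriors E_i rho E_i^†, each sorted.
lam_posts = [sorted((e[k] ** 2 * rho[k] for k in range(2)), reverse=True) for e in E]
sum_of_spectra = [sum(l[k] for l in lam_posts) for k in range(2)]
direct_sum = [v for l in lam_posts for v in l]
lam_rho = sorted(rho, reverse=True)

first_constraint = majorized_by(lam_rho, sum_of_spectra)    # lambda(rho) ≺ sum_i lambda(...)
second_constraint = majorized_by(direct_sum, lam_rho)       # ⊕_i lambda(...) ≺ lambda(rho)
print(first_constraint, second_constraint)
```

Here the sum of sorted spectra is approximately (0.775, 0.225) and the direct sum is approximately (0.6, 0.175, 0.15, 0.075), so both relations hold with room to spare.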



C. Consequences of the constraint equations 

The constraints proved in Theorems 1 and 2 are very
strong and, not surprisingly, have many interesting
consequences. We now elucidate a few of these consequences
using the notions of Schur-concavity and Schur-convexity.
A Schur-convex function $f(\cdot)$ is a real-valued function
which preserves the majorization relation, in the sense
that if $x \prec y$ then $f(x) \le f(y)$. Simple necessary
and sufficient conditions for a function to be Schur-convex
are known [?], and many interesting functions
are Schur-convex. These include, for example, the function
$x \to f(x) = \sum_j x_j^k$, for any $k \ge 1$. Similarly,
a Schur-concave function $f(\cdot)$ is one such that if $x \prec y$
then $f(x) \ge f(y)$. Equivalently, $f(\cdot)$ is Schur-concave
if $-f(\cdot)$ is Schur-convex. Perhaps the canonical example
of a Schur-concave function is the Shannon entropy
$H(x) = -\sum_j x_j \log_2(x_j)$, so that whenever $x \prec y$ it
follows that $H(x) \ge H(y)$, giving further justification to the
intuitive notion that $x \prec y$ means that $x$ is more
disordered than $y$. Applying the Schur-concavity of Shannon's
entropy to the results of Theorems 1 and 2 we obtain
an attractive suite of results. First, applying the
Schur-concavity of $H(\cdot)$ to the Theorem 1 constraint
$\lambda(\rho) \prec \sum_i p_i \lambda(\rho_i)$ gives

\[
S(\rho) \ge H\Big(\sum_i p_i \lambda(\rho_i)\Big).
\tag{50}
\]



Applying the concavity of the Shannon entropy to the 
right hand side, we obtain as a corollary the concavity of 
the von Neumann entropy, 



\[
S(\rho) \ge \sum_i p_i S(\rho_i).
\tag{51}
\]
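The chain $S(\rho) \ge H(\sum_i p_i \lambda(\rho_i)) \ge \sum_i p_i S(\rho_i)$ behind the concavity of the von Neumann entropy can be illustrated on a pair of non-commuting qubit states. This is a minimal sketch under stated assumptions: the states `rho1` and `rho2`, the equal weights, and the helpers `eig2` and `shannon` are hypothetical choices made here, with each real symmetric 2x2 matrix stored as its entries (a, b, d).

```python
import math

def eig2(a, b, d):
    """Eigenvalues (non-increasing) of the real symmetric matrix [[a, b], [b, d]]."""
    mean, r = (a + d) / 2, math.hypot((a - d) / 2, b)
    return [mean + r, mean - r]

def shannon(x):
    """Shannon entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(v * math.log2(v) for v in x if v > 0)

p = [0.5, 0.5]
rho1 = (0.9, 0.0, 0.1)   # diag(0.9, 0.1)
rho2 = (0.5, 0.3, 0.5)   # eigenvalues 0.8 and 0.2, not diagonal in the same basis
rho = tuple(p[0] * u + p[1] * v for u, v in zip(rho1, rho2))

lam1, lam2, lam = eig2(*rho1), eig2(*rho2), eig2(*rho)
mixed_spectrum = [p[0] * u + p[1] * v for u, v in zip(lam1, lam2)]  # sum_i p_i lambda(rho_i)

S_rho = shannon(lam)                                   # left-hand side of (50)
H_mixed = shannon(mixed_spectrum)                      # right-hand side of (50)
avg_S = p[0] * shannon(lam1) + p[1] * shannon(lam2)    # right-hand side of (51)
print(S_rho, H_mixed, avg_S)
```

For these inputs the mixture has spectrum (0.75, 0.25) while the averaged spectrum is (0.85, 0.15), so the three entropies are strictly ordered.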



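Applying the Schur-concavity of $H(\cdot)$ to the Theorem 1
constraint $\oplus_i\, p_i \lambda(\rho_i) \prec \lambda(\rho)$ and doing
some simple algebra gives

\[
\sum_i p_i S(\rho_i) + H(p_i) \ge S(\rho).
\tag{52}
\]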



This result was obtained previously by Lanford and
Robinson using different techniques. Applying the
Schur-concavity of $H(\cdot)$ to the Theorem 2 constraint
$\lambda(\rho) \prec \sum_i p_i \lambda(\rho'_i)$, followed by the concavity
of the Shannon entropy, gives

\[
S(\rho) \ge \sum_i p_i S(\rho'_i).
\tag{53}
\]



Essentially the same result has been obtained previously
in the context of entanglement transformation [?], where
it expresses the fact that local processes cannot increase
the amount of entanglement present in a system. Finally,
applying the Schur-concavity of $H(\cdot)$ to the Theorem 2
constraint $\oplus_i\, p_i \lambda(\rho'_i) \prec \lambda(\rho)$ gives
the beautiful inequality

\[
H(p_i) + \sum_i p_i S(\rho'_i) \ge S(\rho),
\tag{54}
\]



which implies that in order to lower the entropy of a system
by an amount $\Delta$, on average, the information $H(p_i)$
collected by the measurement must be at least as large
as $\Delta$. This fact can be seen as a quantum mechanical
expression of the principle, expressed by Landauer
in [?] and fleshed out by Bennett [?] and Zurek, that
measurement of a physical system carries with it a
thermodynamic cost when the measurement record is erased,
and proper accounting of this cost enables one to solve
the conundrum posed by Maxwell's demon. (See [?] for
a review.)
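The entropy balance for measurement, $S(\rho) \ge \sum_i p_i S(\rho'_i)$ together with $H(p_i) + \sum_i p_i S(\rho'_i) \ge S(\rho)$, can be made concrete with a small calculation. The sketch below assumes a diagonal qubit state and a two-outcome measurement with diagonal measurement matrices; these inputs and the names `E_sq`, `entropy_drop`, and `info` are illustrative assumptions, not data from the text.

```python
import math

def shannon(x):
    """Shannon entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(v * math.log2(v) for v in x if v > 0)

rho = [0.75, 0.25]                      # diagonal of the prior state
E_sq = [[0.8, 0.3], [0.2, 0.7]]         # diagonals of E_i^† E_i; columns sum to 1

# Diagonal of E_i rho E_i^† for each outcome, then probabilities and posteriors.
unnorm = [[e[k] * rho[k] for k in range(2)] for e in E_sq]
p = [sum(u) for u in unnorm]
posts = [[v / pi for v in u] for u, pi in zip(unnorm, p)]

S_prior = shannon(rho)
avg_S_post = sum(pi * shannon(s) for pi, s in zip(p, posts))
info = shannon(p)                       # H(p_i), information collected
entropy_drop = S_prior - avg_S_post     # average entropy reduction Delta

print(entropy_drop, info)
```

In this example the average entropy reduction is roughly 0.15 bits, well below the roughly 0.91 bits of information recorded by the measurement.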

Applying the Schur-convexity of the functions
$f(x) = \sum_j x_j^k$ for $k \ge 1$ to the results of Theorems 1 and 2 also
gives a number of interesting constraints. The arguments
used are analogous to those given above for the Shannon
entropy, so the details will be omitted, and we merely
state the results:

\[
\sum_i p_i^k\, \mathrm{tr}(\rho_i^k) \le \mathrm{tr}(\rho^k) \le \sum_i p_i\, \mathrm{tr}(\rho_i^k).
\tag{55}
\]
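For $k = 2$ the inequality (55) is a statement about purities, and it can be verified exactly with rational arithmetic. The sketch below assumes an equal mixture of the pure qubit states $|0\rangle\langle 0|$ and $|+\rangle\langle +|$; the example states and the helpers `matmul` and `trace_power` are hypothetical choices made for illustration.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace_power(m, k):
    """tr(m^k) for a 2x2 matrix m and an integer k >= 1."""
    out = m
    for _ in range(k - 1):
        out = matmul(out, m)
    return out[0][0] + out[1][1]

half = F(1, 2)
rho1 = [[F(1), F(0)], [F(0), F(0)]]     # |0><0|
rho2 = [[half, half], [half, half]]     # |+><+|
p = [half, half]
rho = [[p[0] * rho1[i][j] + p[1] * rho2[i][j] for j in range(2)] for i in range(2)]

k = 2
lower = sum(pi ** k * trace_power(r, k) for pi, r in zip(p, (rho1, rho2)))
middle = trace_power(rho, k)
upper = sum(pi * trace_power(r, k) for pi, r in zip(p, (rho1, rho2)))
print(lower, middle, upper)   # 1/2 3/4 1
```

Exact fractions are used so that the comparison does not depend on floating-point tolerance.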



IV. PARTIAL CONVERSES TO THE 
CONSTRAINTS ON MIXING AND 
MEASUREMENT 

Given the constraints on mixing and measurement described
in Theorems 1 and 2 it is natural to ask if these
constraints completely characterize the processes of mixing
and measurement, respectively. We will show below
that the answer to this question is no. However, partial
progress towards achieving simple characterizations
of mixing and measurement may be reported in the form
of a partial converse to Theorem 1, described below in
Subsection IV A, and a partial converse to Theorem 2
described in Subsection IV B.



A. Partial converse to the constraints on mixing 

Given the constraints Theorem 1 imposes on mixing
it is natural to ask whether these constraints completely
characterize the mixing process. That is, given a density
matrix $\rho$, probabilities $p_i$ and vectors $\lambda_i$ with
non-negative, non-increasing components which sum to one,
and such that

\begin{align}
\lambda(\rho) &\prec \sum_i p_i \lambda_i \tag{57} \\
\oplus_i\, p_i \lambda_i &\prec \lambda(\rho), \tag{58}
\end{align}

does it follow that there exist density matrices $\rho_i$ such
that $\lambda(\rho_i) = \lambda_i$ and $\rho = \sum_i p_i \rho_i$?

We will show below that the answer to this question is
no; however, I suspect that some characterization along
similar lines is possible. Progress towards such a
characterization can be reported in the form of a partial
converse to Theorem 1, which states that provided (57)
holds then there exist states $\rho_{ij}$ and a probability
distribution $p_{ij}$ such that $\lambda(\rho_{ij}) = \lambda_i$, independent of the
value of the index $j$, and $p_i = \sum_j p_{ij}$ for each $i$, as well
as $\rho = \sum_{ij} p_{ij} \rho_{ij}$. That is, in order to obtain a
converse to (57) we need to introduce an extra index, $j$. We
will show below that it is necessary to introduce the
extra index if only (57) is assumed as a hypothesis for the
converse. Let's state and prove the partial converse as
Theorem 3.

Theorem 3: Suppose $\rho$ is a density matrix and $\lambda_i$ are
vectors with non-negative, non-increasing components
summing to one. Suppose $p_i$ are probabilities such that

\[
\lambda(\rho) \prec \sum_i p_i \lambda_i.
\tag{59}
\]

Then there exist density matrices $\rho_{ij}$ and a probability
distribution $p_{ij}$ such that $p_i = \sum_j p_{ij}$, $\lambda(\rho_{ij}) = \lambda_i$, and
$\rho = \sum_{ij} p_{ij} \rho_{ij}$.

To prove Theorem 3 we need the result stated in
Subsection II B that $x \prec y$ if and only if there exist
probabilities $q_j$ and permutation matrices $P_j$ such that
$x = \sum_j q_j P_j y$. Applying this result with the
assumption (59) we obtain

\[
\lambda(\rho) = \sum_{ij} p_i q_j P_j \lambda_i.
\tag{60}
\]

Working in the basis in which $\rho$ is diagonal, and defining
$\Lambda_i$ to be the diagonal matrix with diagonal entries
$\lambda_i$, we may set $p_{ij} = p_i q_j$ and $\rho_{ij} = P_j \Lambda_i P_j^\dagger$,
obtaining $p_i = \sum_j p_{ij}$ and $\lambda(\rho_{ij}) = \lambda_i$. Finally, the equation
$\rho = \sum_{ij} p_{ij} \rho_{ij}$ follows immediately from these definitions
and (60), completing the proof.
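The construction in the proof of Theorem 3 can be carried out by hand for a qubit, where the only permutations are the identity and the swap. The sketch below is an illustration under stated assumptions: it uses the spectrum (7/12, 5/12) with $\lambda_1 = (1, 0)$, $\lambda_2 = (1/2, 1/2)$ and equal weights, and it stores each diagonal state $\rho_{ij} = P_j \Lambda_i P_j^\dagger$ simply as its tuple of diagonal entries. The closed-form weight `q0` is specific to two dimensions.

```python
from fractions import Fraction as F

lam_rho = (F(7, 12), F(5, 12))                    # spectrum of rho, non-increasing
p = (F(1, 2), F(1, 2))
lams = ((F(1), F(0)), (F(1, 2), F(1, 2)))         # lambda_1, lambda_2

# y = sum_i p_i lambda_i; in two dimensions x ≺ y means x = q0*y + (1 - q0)*swap(y).
y = tuple(sum(pi * l[k] for pi, l in zip(p, lams)) for k in range(2))
q0 = F(1) if y[0] == y[1] else (lam_rho[0] - y[1]) / (y[0] - y[1])
q = (q0, 1 - q0)
perms = ((0, 1), (1, 0))                          # identity and swap

# p_ij = p_i q_j and rho_ij = P_j Lambda_i P_j^†, kept as diagonals.
p_ij = {(i, j): p[i] * q[j] for i in range(2) for j in range(2)}
rho_ij = {(i, j): tuple(lams[i][perms[j][k]] for k in range(2))
          for i in range(2) for j in range(2)}

mixture = tuple(sum(p_ij[ij] * rho_ij[ij][k] for ij in p_ij) for k in range(2))
marginals = tuple(sum(p_ij[(i, j)] for j in range(2)) for i in range(2))
print(q0, mixture)
```

With these inputs `q0` is 2/3, the marginals reproduce the $p_i$, and the mixture reproduces the diagonal of $\rho$ exactly, while each $\rho_{ij}$ has spectrum $\lambda_i$ by construction.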

What of a tight converse to Theorem 1? It is easy
to see that it is not possible to obtain a tight converse
to (57) alone, as follows. Suppose we choose $\rho = I/2$ to
be the completely mixed state of a single qubit, and define
a probability distribution on just one outcome, the
trivial distribution $p_1 = 1$, with corresponding vector
$\lambda_1 = (1,0)$. Clearly, $\lambda(\rho) \prec \sum_i p_i \lambda_i$, yet it is not
possible to find a state $\rho_1$ such that $\rho = p_1 \rho_1$ and $\lambda(\rho_1) = \lambda_1$.
Thus, in this example, it is necessary to introduce extra
indices, just as was done in Theorem 3.

Might it be that conditions (57) and (58) together
completely characterize the mixing process? The following
example, due to Julia Kempe, shows that this
is not the case. Suppose we consider a qubit system,
and choose $\rho = \mathrm{diag}(5/12, 7/12)$, $p_1 = p_2 = 1/2$,
and $\lambda_1 = (1,0)$, $\lambda_2 = (1/2,1/2)$. It is easy to verify
that conditions (57) and (58) are satisfied with these
choices. Unfortunately, it is not possible to find states
$\rho_1$ and $\rho_2$ with vectors of eigenvalues $\lambda_1$ and $\lambda_2$ such
that $\rho = p_1 \rho_1 + p_2 \rho_2$, since with these choices for $\lambda_1$ and
$\lambda_2$ it follows that $\rho_1$ must be a pure state and $\rho_2 = I/2$
the completely mixed state, so $p_1 \rho_1 + p_2 \rho_2$ has eigenvalues
3/4 and 1/4, which are not equal to 5/12 and 7/12.
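The arithmetic in this counterexample is easy to confirm exactly. The following sketch, with the helper `majorized_by` introduced here as an assumption, checks that both majorization conditions hold for the stated data and that the spectrum forced on $p_1 \rho_1 + p_2 \rho_2$ still differs from that of $\rho$.

```python
from fractions import Fraction as F

def majorized_by(x, y):
    """Exact test of x ≺ y, after zero-padding and sorting both vectors."""
    n = max(len(x), len(y))
    xs = sorted(x, reverse=True) + [0] * (n - len(x))
    ys = sorted(y, reverse=True) + [0] * (n - len(y))
    sx = sy = 0
    for u, v in zip(xs, ys):
        sx, sy = sx + u, sy + v
        if sx > sy:
            return False
    return sx == sy

lam_rho = [F(7, 12), F(5, 12)]                    # spectrum of diag(5/12, 7/12)
p = [F(1, 2), F(1, 2)]
lams = [[F(1), F(0)], [F(1, 2), F(1, 2)]]

weighted_sum = [sum(pi * l[k] for pi, l in zip(p, lams)) for k in range(2)]
direct_sum = [pi * v for pi, l in zip(p, lams) for v in l]

first_condition = majorized_by(lam_rho, weighted_sum)    # lambda(rho) ≺ sum_i p_i lambda_i
second_condition = majorized_by(direct_sum, lam_rho)     # ⊕_i p_i lambda_i ≺ lambda(rho)

# Any pure rho_1 mixed equally with I/2 has this spectrum, whatever the pure state is.
forced = [p[0] + p[1] / 2, p[1] / 2]
print(first_condition, second_condition, forced == lam_rho)   # True True False
```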



Despite this example, I believe it likely that conditions
along the lines of (57) and (58) may be used to completely
characterize the process of mixing in quantum mechanics.



B. Partial converse to the constraints on 
measurement 

Given the constraints Theorem 2 imposes on the quantum
measurement process it is natural to ask whether
these constraints completely characterize the possible
posterior states and probabilities which may occur in
such a measurement. That is, supposing $\rho$ is a density
matrix, $p_i$ is a probability distribution, and $\rho'_i$ are density
matrices such that

\begin{align}
\lambda(\rho) &\prec \sum_i p_i \lambda(\rho'_i) \tag{61} \\
\oplus_i\, p_i \lambda(\rho'_i) &\prec \lambda(\rho), \tag{62}
\end{align}

does it follow that there exist measurement matrices $\{E_i\}$
satisfying the completeness relation $\sum_i E_i^\dagger E_i = I$ and
giving the states $\rho'_i$ as posterior states, with probabilities
$p_i$, when the measurement is performed on a system
initially prepared in the state $\rho$?

We will show below that the answer to this question is
no; however, I suspect that some characterization along
similar lines is possible. Progress towards such a
characterization can be reported in the form of a partial
converse to Theorem 2, which states that provided the
relation (61) holds, then there is a quantum measurement
described by measurement matrices $\{E_{ij}\}$ such that the
corresponding posterior states $\rho'_{ij}$ satisfy $\rho'_{ij} = \rho'_i$ for
every $j$, and the measurement probabilities $p_{ij}$ satisfy
$\sum_j p_{ij} = p_i$. Thus, in order to obtain a converse to (61)
we need to introduce an extra index, $j$, just as we did
earlier in the partial converse to Theorem 1. Also
analogously to that case, we show below that it is necessary
to introduce the extra index with only (61) as hypothesis
for the converse. Let's state and prove the partial
converse as Theorem 4.

Theorem 4: Suppose $\rho$ is a density matrix with vector
of eigenvalues $\lambda$, and $\sigma_i$ are density matrices with
vectors of eigenvalues $\lambda_i$. Suppose $p_i$ are probabilities
such that

\[
\lambda \prec \sum_i p_i \lambda_i.
\tag{63}
\]

Then there exist matrices $\{E_{ij}\}$ and a probability
distribution $p_{ij}$ such that

\begin{align}
\sum_{ij} E_{ij}^\dagger E_{ij} &= I \tag{64} \\
E_{ij} \rho E_{ij}^\dagger &= p_{ij} \sigma_i \tag{65} \\
\sum_j p_{ij} &= p_i. \tag{66}
\end{align}

To prove Theorem 4, we again use the result that $x \prec y$
if and only if there exist probabilities $q_j$ and permutation
matrices $P_j$ such that $x = \sum_j q_j P_j y$. By assumption we
have $\lambda \prec \sum_i p_i \lambda_i$ and thus there exist permutation
matrices $P_j$ and probabilities $q_j$ such that

\[
\lambda = \sum_{ij} p_i q_j P_j \lambda_i.
\tag{67}
\]

Without loss of generality we may assume that $\rho$ and
$\sigma_i$ are all diagonal in the same basis, with non-increasing
diagonal entries, since if this is not the case then it is
an easy matter to prepend or append unitary matrices
to the measurement matrices to obtain the correct
transformation. With this convention, we define matrices $E_{ij}$
by

\[
E_{ij} \sqrt{\rho} \equiv \sqrt{p_i q_j \sigma_i}\, P_j^\dagger.
\tag{68}
\]

In order for $E_{ij}$ to be well-defined by this formula alone
it is necessary that $\rho$ be invertible. If this is not the
case then the $E_{ij}$ are defined on the support of $\rho$ by the
formula (68), and to act as the zero operator on the
orthocomplement of the support of $\rho$. It is convenient to
let $P$ be the projector onto the support of $\rho$. Note that
we have

\[
\sqrt{\rho}\, \Big(\sum_{ij} E_{ij}^\dagger E_{ij}\Big) \sqrt{\rho} = \sum_{ij} p_i q_j P_j \sigma_i P_j^\dagger.
\tag{69}
\]

Comparing with (67) we see that the right-hand side of
the last equation is just $\rho$ and thus

\[
\sqrt{\rho}\, \Big(\sum_{ij} E_{ij}^\dagger E_{ij}\Big) \sqrt{\rho} = \rho,
\tag{70}
\]

from which we deduce that $\sum_{ij} E_{ij}^\dagger E_{ij} = P$, the projector
onto the support of $\rho$. Letting $Q = I - P$ be the
projector onto the orthocomplement of the support, we
can append an additional measurement matrix $E_{00} = Q$
to the collection $E_{ij}$ to ensure that the completeness
relation $\sum_{ij} E_{ij}^\dagger E_{ij} = I$ is satisfied. Furthermore, from the
definition (68) it follows that

\[
E_{ij} \rho E_{ij}^\dagger = p_i q_j \sigma_i,
\tag{71}
\]



and thus upon performing a measurement defined by
the measurement matrices $\{E_{ij}\}$ the result $(i,j)$ occurs
with probability $p_{ij} = p_i q_j$, $\sum_j p_{ij} = p_i$, and the
post-measurement state is $\sigma_i$. This completes the proof of
Theorem 4.

Theorem 4 is not a sharp converse to the condition of
Equation (61) because of the extra index $j$. Introducing
some such index is certainly necessary with the present
hypotheses, as may be seen by considering an example
with $\lambda = (1/2, 1/2)$, and the trivial probability distribution
on one outcome, $p_1 = 1$, with $\lambda_1 = (1,0)$. Then
$\lambda \prec p_1 \lambda_1$, but it is clear that there does not exist an
$E_1$ such that $E_1 \rho E_1^\dagger = \rho_1$, where $\lambda(\rho) = \lambda$, $\lambda(\rho_1) = \lambda_1$
and $E_1^\dagger E_1 = I$, because the last equation implies that
$E_1$ must be unitary. It is not difficult to construct more
complex examples to convince oneself that this behaviour
is generic.

Might it be that the conditions (61) and (62) together
characterize the posterior states and probabilities achievable
through a quantum measurement? The following argument,
due to Julia Kempe and the author, shows that
this is not the case. Suppose we consider a qubit system,
and choose $\rho = \mathrm{diag}(5/12, 7/12)$, $p_1 = p_2 = 1/2$, and
$\rho'_1 = \mathrm{diag}(1,0)$, $\rho'_2 = \mathrm{diag}(1/2, 1/2)$. It is easy to verify
that conditions (61) and (62) are satisfied with these
choices. Unfortunately, it is not possible to find measurement
matrices $E_1$ and $E_2$ satisfying $\sum_i E_i^\dagger E_i = I$
and giving posterior states $\rho'_1$ and $\rho'_2$ with equal
probabilities 1/2, when the state $\rho$ is measured. This can be
seen in a variety of ways. A simple direct way is to note
that the purity of $\rho'_1$ implies that $E_1$ must have the form
$E_1 = \alpha |a\rangle\langle b|$ for normalized states $|a\rangle$ and $|b\rangle$, and some
$\alpha > 0$. Thus

\begin{align}
E_2^\dagger E_2 &= I - E_1^\dagger E_1 \tag{72} \\
&= I - \alpha^2 |b\rangle\langle b| \tag{73} \\
&= (1 - \alpha^2) |b\rangle\langle b| + |c\rangle\langle c|, \tag{74}
\end{align}

where $|c\rangle$ is orthonormal to $|b\rangle$. The polar decomposition
gives $E_2 = U \sqrt{E_2^\dagger E_2}$ for some unitary $U$, so

\[
E_2 = \sqrt{1 - \alpha^2}\, U|b\rangle\langle b| + U|c\rangle\langle c|.
\tag{75}
\]

We are requiring that $E_2 \rho E_2^\dagger = I/4$, so it must be the case
that $E_2$ is non-singular, and thus $\alpha < 1$. Premultiplying
by $E_2^{-1}$ and postmultiplying by $(E_2^\dagger)^{-1}$ gives

\[
\rho = \frac{1}{4} \Big( \frac{1}{1 - \alpha^2} |b\rangle\langle b| + |c\rangle\langle c| \Big).
\tag{76}
\]

Since $|b\rangle$ and $|c\rangle$ are orthonormal, one eigenvalue of such a $\rho$
is 1/4, so it cannot be equal to $\mathrm{diag}(5/12, 7/12)$, which is the
desired contradiction. Despite this example, I believe it likely
that conditions along the lines of (61) and (62) may be
used to characterize the process of measurement in quantum
mechanics.
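It is instructive to run the construction from the proof of Theorem 4 on the same qubit data, $\rho$ with spectrum (7/12, 5/12), target states $\mathrm{diag}(1,0)$ and $\mathrm{diag}(1/2,1/2)$, and $p_1 = p_2 = 1/2$. The sketch below is an illustration under stated assumptions: $\rho$ is written with non-increasing diagonal as the proof requires, the weights $q = (2/3, 1/3)$ on the identity and swap permutations are worked out by hand for this case, and the measurement matrices are taken to be $E_{ij} = \sqrt{p_i q_j \sigma_i}\, P_j^\dagger \rho^{-1/2}$. All helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def diag(v):
    return [[v[0], 0.0], [0.0, v[1]]]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

rho = [7 / 12, 5 / 12]                   # diagonal of rho, non-increasing
sigma = [[1.0, 0.0], [0.5, 0.5]]         # diagonals of sigma_1 and sigma_2
p = [0.5, 0.5]
q = [2 / 3, 1 / 3]                       # weights on the identity and the swap
perms = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]

inv_sqrt_rho = diag([1 / math.sqrt(r) for r in rho])
E = {}
for i in range(2):
    sqrt_sigma = diag([math.sqrt(s) for s in sigma[i]])
    for j in range(2):
        core = matmul(matmul(sqrt_sigma, transpose(perms[j])), inv_sqrt_rho)
        E[(i, j)] = scale(math.sqrt(p[i] * q[j]), core)

# Completeness: sum_ij E_ij^T E_ij should be the identity (all matrices are real).
total = [[0.0, 0.0], [0.0, 0.0]]
for e in E.values():
    ete = matmul(transpose(e), e)
    total = [[total[r][c] + ete[r][c] for c in range(2)] for r in range(2)]
completeness_error = max_abs_diff(total, diag([1.0, 1.0]))

# Posterior check: E_ij rho E_ij^T should equal p_i q_j sigma_i.
posterior_error = max(
    max_abs_diff(matmul(matmul(E[(i, j)], diag(rho)), transpose(E[(i, j)])),
                 scale(p[i] * q[j], diag(sigma[i])))
    for i in range(2) for j in range(2))
print(completeness_error, posterior_error)
```

Both errors are at the level of floating-point rounding, so the four-outcome measurement exists even though no two-outcome measurement with these posterior states and probabilities does.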



V. ENTANGLEMENT TRANSFORMATION 

The problem of entanglement transformation is a natural
context in which the results of the present paper may
be applied. The problem arises as a consequence of the
fundamental question of how we may convert one type of
physical resource into another, and there has been
considerable effort devoted to determining when it is possible
to convert one type of entanglement to another. In [?] a
connection was noted between entanglement transformation and
majorization, namely, that if $|\psi\rangle$ and $|\phi\rangle$ are pure states of a
bipartite quantum system with components belonging to Alice ($A$)
and Bob ($B$) respectively, then Alice and Bob can transform
the state $|\psi\rangle$ into the state $|\phi\rangle$ using local operations
on their respective systems and classical communication
between Alice and Bob, if and only if

\[
\lambda_\psi \prec \lambda_\phi,
\tag{77}
\]

where $\lambda_\psi$ (respectively $\lambda_\phi$) is the vector of eigenvalues
of the reduced density matrix for Alice's system when
the joint system is in the state $|\psi\rangle$ (respectively $|\phi\rangle$).
As per usual, the components of such vectors are ordered into
non-increasing order. This result has subsequently been
generalized by Vidal to the case of conclusive transformation,
and even further by Jonathan and Plenio to
the problem where Alice and Bob are supplied with a
state $|\psi\rangle$ and wish to transform this state into an ensemble
of states in which the state $|\phi_i\rangle$ occurs with probability
$p_i$. (See also Hardy for an instructive alternative
approach to results of this type.) The necessary and
sufficient condition for such a transformation to be possible
is that

\[
\lambda_\psi \prec \sum_i p_i \lambda_{\phi_i}.
\tag{78}
\]

We now explain how this result can be seen as an easy
consequence of the results proved in the present paper,
and thus the connection between majorization and
entanglement is really a consequence of a deeper connection
between majorization and measurement. By a result
of Lo and Popescu it is possible to transform $|\psi\rangle$
into the ensemble $\{p_i, |\phi_i\rangle\}$ by local operations and
classical communication if and only if it is possible to make
the transformation via the following simplified procedure:
first, Alice performs a generalized measurement on her
state, then sends the result to Bob, who performs a
unitary operation on his system conditional on the outcome
of the measurement Alice made. Let $\rho = \mathrm{tr}_B(|\psi\rangle\langle\psi|)$
be the initial state of Alice's system, and suppose Alice
performs a quantum measurement described by measurement
matrices $E_i$, so that outcome $i$ occurs with probability
$p_i$ and $(E_i \otimes U_i)|\psi\rangle = \sqrt{p_i}\,|\phi_i\rangle$, for some unitary
operator $U_i$ acting on Bob's system. Considering Alice's
system alone and observing that $E_i \rho E_i^\dagger = \sigma_i$, where
$\sigma_i = p_i\, \mathrm{tr}_B(|\phi_i\rangle\langle\phi_i|)$, we deduce from Theorem 2 that

\[
\lambda_\psi \prec \sum_i \lambda(\sigma_i),
\tag{79}
\]

which is equivalent to (78), since $\lambda(\sigma_i) = p_i \lambda_{\phi_i}$. To prove
the converse, suppose (78) holds. Then by Theorem 4 there exists
a quantum measurement described by measurement matrices
$E_{ij}$, and probabilities $p_{ij}$ such that

\[
E_{ij} \rho E_{ij}^\dagger = p_{ij} \sigma_i; \qquad \sum_j p_{ij} = p_i,
\tag{80}
\]

where $\sigma_i$ now denotes the normalized reduced state
$\mathrm{tr}_B(|\phi_i\rangle\langle\phi_i|)$. The procedure for Alice and Bob to produce
the ensemble is for Alice to perform the measurement described by
the set $E_{ij}$. The post-measurement state $|\phi_{ij}\rangle$ is then a
purification [?] of the state $\sigma_i$, and it can be shown (see
[?] or Section 2.5 of [?]) that by performing an appropriate
unitary transformation Bob can convert the state
$|\phi_{ij}\rangle$ into the state $|\phi_i\rangle$, with total probability of
obtaining the state $|\phi_i\rangle$ equal to $p_i$. Thus Equation (78)
represents a necessary and sufficient condition for it to be
possible to transform the state $|\psi\rangle$ into the ensemble
$\{p_i, |\phi_i\rangle\}$ by local operations and classical communication.
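The majorization criterion for ensemble transformations reduces to a few lines of arithmetic once the Schmidt spectra are known. The sketch below assumes each state is given by the eigenvalues of Alice's reduced density matrix; the function names `majorized_by` and `ensemble_reachable` and the example spectra are hypothetical and serve only to illustrate the test $\lambda_\psi \prec \sum_i p_i \lambda_{\phi_i}$.

```python
def majorized_by(x, y, tol=1e-9):
    """Return True if x ≺ y, after zero-padding and sorting both vectors."""
    n = max(len(x), len(y))
    xs = sorted(x, reverse=True) + [0.0] * (n - len(x))
    ys = sorted(y, reverse=True) + [0.0] * (n - len(y))
    sx = sy = 0.0
    for u, v in zip(xs, ys):
        sx, sy = sx + u, sy + v
        if sx > sy + tol:
            return False
    return abs(sx - sy) <= tol

def ensemble_reachable(lam_psi, ensemble):
    """Test lam_psi ≺ sum_i p_i lam_phi_i for an ensemble [(p_i, lam_phi_i), ...]."""
    n = max(len(lam_psi), max(len(lam) for _, lam in ensemble))
    avg = [0.0] * n
    for p_i, lam in ensemble:
        ordered = sorted(lam, reverse=True) + [0.0] * (n - len(lam))
        for k in range(n):
            avg[k] += p_i * ordered[k]
    return majorized_by(lam_psi, avg)

bell = [0.5, 0.5]                 # maximally entangled two-qubit state
partial = [0.8, 0.2]              # a less entangled state

to_partial = ensemble_reachable(bell, [(1.0, partial)])
to_bell = ensemble_reachable(partial, [(1.0, bell)])
to_ensemble = ensemble_reachable([0.6, 0.4], [(0.5, [0.5, 0.5]), (0.5, [0.9, 0.1])])
print(to_partial, to_bell, to_ensemble)   # True False True
```

The last case shows that an ensemble may contain a member more entangled than the initial state, provided the averaged spectrum, here (0.7, 0.3), still majorizes the initial one.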



VI. CONCLUSION 

We have shown that there are strong fundamental constraints
on the processes of mixing and measurement in
quantum mechanics that may be naturally expressed in
the language of majorization. Although the results in
the present paper don't completely characterize these
processes, they suggest that there may exist a simple
set of conditions which substantially simplify the usual
characterization of these processes via operator equations.
Another interesting direction for further research
is to generalize the constraints on measurements obtained
in this paper to better understand how two or more
states may transform simultaneously under a measurement.
Once again, although this problem is in principle
already "solved", in the sense that there is an operator
equation specifying exactly what transformations
may occur, results such as those in the present paper
and in [?] indicate that much more explicit characterizations
may be possible. Such explicit conditions are
likely to have applications to fundamental problems such
as the problem of transformation of mixed state entanglement [?],
and to the problem of determining to what
extent the acquisition of information about the identity
of a quantum state disturbs the system being measured [?].